An Approach for Document Clustering using Agglomerative Clustering and Hebbian-type Neural Networkx
نویسندگان
چکیده
Clustering is a useful method that categorizes a large quantity of unordered text documents into a small number of meaningful and coherent collections, thereby providing a basis for instinctive and informative navigation and browsing mechanisms. Different type of distance functions and similarity measures have been used for clustering, such as squared, cosine similarity, Euclidean distance and relative entropy. This paper presents text document space dimension reduction in text document retrieval by agglomerative clustering and Hebbian-type neural network. Hebbian-type neural network reduce document space to two dimensions so each document is represented as a point in the reduced document space. Furthermore, the clusters are formed in compact document space.
منابع مشابه
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملNGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self- Organizing Map
Identifying clusters is an important aspect of data analysis. This paper proposes a noveldata clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizingmap (NGTSOM ) and neural gas (NG) are used in combination with Competitive Hebbian Learning(CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clusteringdata. Different ...
متن کاملImplementation of Hybrid Clustering Algorithm with Enhanced K-Means and Hierarchal Clustering
We are propose a hybrid clustering method, the methodology combines the strengths of both partitioning and agglomerative clustering methods. Clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for their interactive visualization and exploration as they provide data-views that are consistent, predictable, and at different levels of granularit...
متن کاملMode region detection using improved Competitive Hebbian Learning for unsupervised clustering
The goal of this paper is to propose an improved competitive Hebbian learning for mode detection using a new activation function, to overcome its sensitivity to local irregularities in pattern distribution. This method is involved with an unsupervised clustering approach divided into four processing stages. It begins by the estimation of the probability density function, followed by a competiti...
متن کاملAn Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
متن کامل